Universität Augsburg Audio Brush: Smart Audio Editing in the Spectrogram
Abstract
Building on a novel paradigm for audio analysis and editing, we develop a set of new, adaptive algorithms that operate on the spectrogram and integrate them into a smart visual audio editing tool in a “what you see is what you hear” style. At the core of our algorithms and methods is a very flexible audio spectrogram that goes beyond the FFT and wavelets and supports manipulating a signal at any chosen time-frequency resolution: Gabor analysis and synthesis. It gives maximum accuracy of the representation, is fully invertible, and enables resolution zooming. Simple audio objects are localized in time and frequency; they can easily be identified visually and selected with simple geometric selection masks such as rectangles, combs, and polygons. For many audio objects, however, the structures in the spectrogram are rather complex. We therefore present several intelligent and adaptive mask selection approaches based on audio fingerprinting and visual pattern matching algorithms. Spectrograms of sounds that are recorded individually under controlled conditions, or selected interactively in the current spectrogram, can be regarded as sophisticated visual templates. In this paper we discuss how to generate templates, how to find the best match in a database of templates, and how to adapt the match to the sound we want to edit.
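The idea of editing through an invertible spectrogram and a rectangular selection mask can be illustrated with a minimal sketch. It assumes a Gaussian-windowed short-time Fourier transform as a simple discrete stand-in for Gabor analysis and synthesis; the abstract does not specify the authors' actual implementation. All function names, the window parameters, and the test signal are hypothetical and chosen only for illustration.

```python
import numpy as np

def gaussian_window(n, sigma):
    """Gaussian window of length n with standard deviation sigma (in samples)."""
    t = np.arange(n) - (n - 1) / 2.0
    return np.exp(-0.5 * (t / sigma) ** 2)

def analyze(x, win, hop):
    """Gaussian-windowed STFT: rows are time frames, columns are frequency bins."""
    n = len(win)
    # Zero-pad by one window length on each side so every sample is well covered.
    xp = np.concatenate([np.zeros(n), x, np.zeros(n)])
    starts = np.arange(0, len(xp) - n + 1, hop)
    frames = np.stack([xp[s:s + n] * win for s in starts])
    return np.fft.rfft(frames, axis=1)

def synthesize(coeffs, win, hop, length):
    """Weighted overlap-add inverse; exact when the coefficients are unmodified."""
    n = len(win)
    total = (coeffs.shape[0] - 1) * hop + n
    out = np.zeros(total)
    norm = np.zeros(total)
    frames = np.fft.irfft(coeffs, n=n, axis=1)
    for i, frame in enumerate(frames):
        s = i * hop
        out[s:s + n] += frame * win
        norm[s:s + n] += win ** 2
    # Remove the padding added in analyze().
    return (out / np.maximum(norm, 1e-12))[n:n + length]

def apply_rectangle_mask(coeffs, sr, n, hop, t_range, f_range):
    """Zero all coefficients inside a time-frequency rectangle (seconds, Hz)."""
    centers = (np.arange(coeffs.shape[0]) * hop - n / 2.0) / sr  # frame centers
    freqs = np.fft.rfftfreq(n, d=1.0 / sr)
    in_t = (centers >= t_range[0]) & (centers <= t_range[1])
    in_f = (freqs >= f_range[0]) & (freqs <= f_range[1])
    edited = coeffs.copy()
    edited[np.ix_(in_t, in_f)] = 0.0
    return edited

# Hypothetical test signal: a steady 440 Hz tone plus a 2000 Hz burst to erase.
sr, n, hop = 8000, 256, 64
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)
burst = 0.5 * np.sin(2 * np.pi * 2000 * t) * ((t >= 0.25) & (t < 0.75))
x = tone + burst

win = gaussian_window(n, sigma=n / 6.0)
coeffs = analyze(x, win, hop)
roundtrip = synthesize(coeffs, win, hop, len(x))  # unedited: reproduces x
masked = apply_rectangle_mask(coeffs, sr, n, hop, (0.2, 0.8), (1750, 2250))
edited = synthesize(masked, win, hop, len(x))     # burst removed, tone kept
```

In this sketch the rectangle spans 0.2 to 0.8 s and 1750 to 2250 Hz, so it encloses the burst while leaving the 440 Hz tone untouched. Changing `n` and `sigma` trades time resolution against frequency resolution, which loosely corresponds to the resolution zooming mentioned in the abstract.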
Similar resources
Universität Augsburg Audio Brush: Editing Audio in the Spectrogram
A tool for editing audio signals in the spectrogram is presented. It allows manipulating the spectrogram of a signal directly at any chosen time-frequency resolution and reconstructing the edited signal in hi-fi quality, a capability that is usually not possible with the Fourier or wavelet transformation. Image processing and computer vision methods are applied to the spectrogram in order to id...
Cipher text only attack on speech time scrambling systems using correction of audio spectrogram
Recently, permutation-based multimedia ciphers were broken in a chosen-plaintext scenario. That attack assumes a very resourceful adversary, which may not always be realistic. To demonstrate the insecurity of these ciphers, we present a ciphertext-only attack on speech permutation ciphers. We show that the inherent redundancies of speech can pave the way for a successful ciphertext-only attack. To that end, regularities ...
Visual Audio: An Interactive Tool for Analyzing and Editing of Audio in the Spectrogram
We present a tool for analyzing and editing audio signals in the visual domain. As the visual representation we use spectrograms, which give descriptive information about the sound. This allows analyzing and editing audio in a "what you see is what you hear" style. Gabor analysis and synthesis serve as a basis to create images and recreate audio signals from the edited images in hi-fi quality. A...
AVLaughterCycle: An Audiovisual Laughing Machine
The AVLaughterCycle project aims at developing an audiovisual laughing machine, capable of recording the laughter of a user and responding to it with machine-generated laughter linked to the input laughter. During the project, an audiovisual laughter database was recorded, including facial points tracking, thanks to the Smart Sensor Integration software developed by the University of Augsbu...
Simplifying Video Editing Using Metadata
Digital video is becoming increasingly ubiquitous. However, editing video remains difficult for several reasons: it is a time-based medium, it has dual tracks of audio and video, and current tools force users to work at the smallest level of detail. Based on interviews with professional video editors, we developed a video editor, called Silver, that uses metadata to make digital video editing m...
Publication date: 2006